PERLEMBED(1)                                                    PERLEMBED(1)

NAME
     perlembed - how to embed perl in your C program

DESCRIPTION
  PREAMBLE
     Do you want to:

     Use C from Perl?
          Read the perlxstut manpage, the perlxs manpage, the h2xs
          manpage, and the perlguts manpage.

     Use a Unix program from Perl?
          Read about back-quotes and about system and exec in the
          perlfunc manpage.

     Use Perl from Perl?
          Read about the do, eval, require, and use entries in the
          perlfunc manpage.

     Use C from C?
          Rethink your design.

     Use Perl from C?
          Read on...

  ROADMAP
     the section on Compiling your C program

     the section on Adding a Perl interpreter to your C program

     the section on Calling a Perl subroutine from your C program

     the section on Evaluating a Perl statement from your C program

     the section on Performing Perl pattern matches and substitutions
     from your C program

     the section on Fiddling with the Perl stack from your C program

     the section on Maintaining a persistent interpreter

     the section on Maintaining multiple interpreter instances

     the section on Using Perl modules, which themselves use C
     libraries, from your C program

     the section on Embedding Perl under Win32

  Compiling your C program
     If you have trouble compiling the scripts in this documentation,
     you're not alone.  The cardinal rule: COMPILE THE PROGRAMS IN
     EXACTLY THE SAME WAY THAT YOUR PERL WAS COMPILED.  (Sorry for
     yelling.)

     Also, every C program that uses Perl must link in the perl library.
     What's that, you ask?  Perl is itself written in C; the perl
     library is the collection of compiled C programs that were used to
     create your perl executable (/usr/bin/perl or equivalent).
     (Corollary: you can't use Perl from your C program unless Perl has
     been compiled on your machine, or installed properly--that's why
     you shouldn't blithely copy Perl executables from machine to
     machine without also copying the lib directory.)

     When you use Perl from C, your C program will--usually--allocate,
     "run", and deallocate a PerlInterpreter object, which is defined by
     the perl library.

     If your copy of Perl is recent enough to contain this documentation
     (version 5.002 or later), then the perl library (and EXTERN.h and
     perl.h, which you'll also need) will reside in a directory that
     looks like this:

         /usr/local/lib/perl5/your_architecture_here/CORE

     or perhaps just

         /usr/local/lib/perl5/CORE

     or maybe something like

         /usr/opt/perl5/CORE

     Execute this statement for a hint about where to find CORE:

         perl -MConfig -e 'print $Config{archlib}'

     Here's how you'd compile the example in the next section, the
     section on Adding a Perl interpreter to your C program, on my Linux
     box:

         % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include
         -I/usr/local/lib/perl5/i586-linux/5.003/CORE
         -L/usr/local/lib/perl5/i586-linux/5.003/CORE
         -o interp interp.c -lperl -lm

     (That's all one line.)  On my DEC Alpha running old 5.003_05, the
     incantation is a bit different:

         % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include
         -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE
         -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib
         -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm

     How can you figure out what to add?  Assuming your Perl is
     post-5.001, execute a perl -V command and pay special attention to
     the "cc" and "ccflags" information.

     You'll have to choose the appropriate compiler (cc, gcc, et al.)
     for your machine: perl -MConfig -e 'print $Config{cc}' will tell
     you what to use.

     You'll also have to choose the appropriate library directory
     (/usr/local/lib/...) for your machine.
     If your compiler complains that certain functions are undefined, or
     that it can't locate -lperl, then you need to change the path
     following the -L.  If it complains that it can't find EXTERN.h and
     perl.h, you need to change the path following the -I.

     You may have to add extra libraries as well.  Which ones?  Perhaps
     those printed by

         perl -MConfig -e 'print $Config{libs}'

     Provided your perl binary was properly configured and installed,
     the ExtUtils::Embed module will determine all of this information
     for you:

         % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

     If the ExtUtils::Embed module isn't part of your Perl distribution,
     you can retrieve it from
     http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils::Embed.
     (If this documentation came from your Perl distribution, then
     you're running 5.004 or better and you already have it.)

     The ExtUtils::Embed kit on CPAN also contains all source code for
     the examples in this document, tests, additional examples and other
     information you may find useful.

  Adding a Perl interpreter to your C program
     In a sense, perl (the C program) is a good example of embedding
     Perl (the language), so I'll demonstrate embedding with
     miniperlmain.c, included in the source distribution.
     Here's a bastardized, nonportable version of miniperlmain.c
     containing the essentials of embedding:

         #include <EXTERN.h>               /* from the Perl distribution     */
         #include <perl.h>                 /* from the Perl distribution     */

         static PerlInterpreter *my_perl;  /***    The Perl interpreter    ***/

         int main(int argc, char **argv, char **env)
         {
             my_perl = perl_alloc();
             perl_construct(my_perl);
             perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
             perl_run(my_perl);
             perl_destruct(my_perl);
             perl_free(my_perl);
         }

     Notice that we don't use the env pointer.  Normally handed to
     perl_parse as its final argument, env here is replaced by NULL,
     which means that the current environment will be used.

     Now compile this program (I'll call it interp.c) into an
     executable:

         % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

     After a successful compilation, you'll be able to use interp just
     like perl itself:

         % interp
         print "Pretty Good Perl \n";
         print "10890 - 9801 is ", 10890 - 9801;
         <CTRL-D>
         Pretty Good Perl
         10890 - 9801 is 1089

     or

         % interp -e 'printf("%x", 3735928559)'
         deadbeef

     You can also read and execute Perl statements from a file while in
     the midst of your C program, by placing the filename in argv[1]
     before calling perl_run.

  Calling a Perl subroutine from your C program
     To call individual Perl subroutines, you can use any of the
     perl_call_* functions documented in the perlcall manpage.  In this
     example we'll use perl_call_argv.

     That's shown below, in a program I'll call showtime.c.

         #include <EXTERN.h>
         #include <perl.h>

         static PerlInterpreter *my_perl;

         int main(int argc, char **argv, char **env)
         {
             char *args[] = { NULL };
             my_perl = perl_alloc();
             perl_construct(my_perl);

             perl_parse(my_perl, NULL, argc, argv, NULL);

             /*** skipping perl_run() ***/

             perl_call_argv("showtime", G_DISCARD | G_NOARGS, args);

             perl_destruct(my_perl);
             perl_free(my_perl);
         }

     where showtime is a Perl subroutine that takes no arguments (that's
     the G_NOARGS) and for which I'll ignore the return value (that's
     the G_DISCARD).  Those flags, and others, are discussed in the
     perlcall manpage.

     I'll define the showtime subroutine in a file called showtime.pl:

         print "I shan't be printed.";

         sub showtime {
             print time;
         }

     Simple enough.  Now compile and run:

         % cc -o showtime showtime.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

         % showtime showtime.pl
         818284590

     yielding the number of seconds that elapsed between January 1, 1970
     (the beginning of the Unix epoch), and the moment I began writing
     this sentence.

     In this particular case we don't have to call perl_run, but in
     general it's considered good practice to ensure proper
     initialization of library code, including execution of all object
     DESTROY methods and package END {} blocks.

     If you want to pass arguments to the Perl subroutine, you can add
     strings to the NULL-terminated args list passed to perl_call_argv.
     For other data types, or to examine return values, you'll need to
     manipulate the Perl stack.  That's demonstrated in a later section
     of this document: the section on Fiddling with the Perl stack from
     your C program.
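
     To make the NULL-terminated args list concrete, here is a minimal
     sketch in plain C of how such a list might be built.  It leans on
     the fact that Perl converts numeric strings back into numbers, so
     simple integers can travel as text.  The helper build_args, the
     limits MAX_ARGS and ARG_LEN, and the subroutine name show_args are
     all hypothetical; the perl_call_argv call appears only in a comment
     so that the sketch compiles without the Perl headers.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ARGS 8      /* hypothetical limit for this sketch        */
#define ARG_LEN  32     /* hypothetical per-argument buffer size     */

/* Hypothetical helper: format `count` integers as text into `storage`
 * and point args[0..count-1] at them, followed by the terminating NULL
 * that perl_call_argv() expects.  Returns the number of arguments
 * stored, or -1 if they do not fit. */
static int build_args(const long *values, int count,
                      char storage[][ARG_LEN], char *args[MAX_ARGS + 1])
{
    int i;

    assert(values != NULL && storage != NULL && args != NULL);
    if (count < 0 || count > MAX_ARGS)
        return -1;

    for (i = 0; i < count; i++) {
        int n = snprintf(storage[i], ARG_LEN, "%ld", values[i]);
        if (n < 0 || n >= ARG_LEN)
            return -1;
        args[i] = storage[i];
    }
    args[count] = NULL;     /* the list must end with NULL */
    return count;
}

/* With the Perl headers available, the list would then be handed over
 * roughly like this (not compiled here; show_args is a made-up sub):
 *
 *     perl_call_argv("show_args", G_DISCARD, args);
 */
```

     Note that G_NOARGS is dropped in the commented call: that flag
     tells Perl the subroutine takes no arguments, which is no longer
     true once the list is non-empty.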

  Evaluating a Perl statement from your C program
     Perl provides two API functions to evaluate pieces of Perl code.
     These are the perl_eval_sv entry in the perlguts manpage and the
     perl_eval_pv entry in the perlguts manpage.

     Arguably, these are the only routines you'll ever need to execute
     snippets of Perl code from within your C program.  Your code can be
     as long as you wish; it can contain multiple statements; it can
     employ the use, require, and do entries in the perlfunc manpage to
     include external Perl files.

     perl_eval_pv lets us evaluate individual Perl strings, and then
     extract variables for coercion into C types.  The following
     program, string.c, executes three Perl strings, extracting an int
     from the first, a float from the second, and a char * from the
     third.

         #include <EXTERN.h>
         #include <perl.h>

         static PerlInterpreter *my_perl;

         int main (int argc, char **argv, char **env)
         {
             char *embedding[] = { "", "-e", "0" };

             my_perl = perl_alloc();
             perl_construct( my_perl );

             perl_parse(my_perl, NULL, 3, embedding, NULL);
             perl_run(my_perl);

             /** Treat $a as an integer **/
             perl_eval_pv("$a = 3; $a **= 2", TRUE);
             printf("a = %d\n", (int)SvIV(perl_get_sv("a", FALSE)));

             /** Treat $a as a float **/
             perl_eval_pv("$a = 3.14; $a **= 2", TRUE);
             printf("a = %f\n", SvNV(perl_get_sv("a", FALSE)));

             /** Treat $a as a string **/
             perl_eval_pv("$a = 'rekcaH lreP rehtonA tsuJ'; $a = reverse($a);", TRUE);
             printf("a = %s\n", SvPV(perl_get_sv("a", FALSE), PL_na));

             perl_destruct(my_perl);
             perl_free(my_perl);
         }

     All of those strange functions with sv in their names help convert
     Perl scalars to C types.
     They're described in the perlguts manpage.

     If you compile and run string.c, you'll see the results of using
     SvIV() to create an int, SvNV() to create a float, and SvPV() to
     create a string:

         a = 9
         a = 9.859600
         a = Just Another Perl Hacker

     In the example above, we've created a global variable to
     temporarily store the computed value of our eval'd expression.  It
     is also possible, and in most cases a better strategy, to fetch the
     return value from perl_eval_pv() instead.  Example:

         ...
         SV *val = perl_eval_pv("reverse 'rekcaH lreP rehtonA tsuJ'", TRUE);
         printf("%s\n", SvPV(val,PL_na));
         ...

     This way, we avoid namespace pollution by not creating global
     variables, and we've simplified our code as well.

  Performing Perl pattern matches and substitutions from your C program
     The perl_eval_sv() function lets us evaluate strings of Perl code,
     so we can define some functions that use it to "specialize" in
     matches and substitutions: match(), substitute(), and matches().

         I32 match(SV *string, char *pattern);

     Given a string and a pattern (e.g., m/clasp/ or /\b\w*\b/, which in
     your C program might appear as "/\\b\\w*\\b/"), match() returns 1
     if the string matches the pattern and 0 otherwise.

         I32 substitute(SV **string, char *pattern);

     Given a pointer to an SV and an =~ operation (e.g., s/bob/robert/g
     or tr[A-Z][a-z]), substitute() modifies the string within the SV
     according to the operation, returning the number of substitutions
     made.

         I32 matches(SV *string, char *pattern, AV **matches);

     Given an SV, a pattern, and a pointer to an empty AV, matches()
     evaluates $string =~ $pattern in an array context, and fills in
     matches with the array elements, returning the number of matches
     found.

     Here's a sample program, match.c, that uses all three (long lines
     have been wrapped here):

         #include <EXTERN.h>
         #include <perl.h>

         /** my_perl_eval_sv(code, error_check)
         ** kinda like perl_eval_sv(),
         ** but we pop the return value off the stack
         **/
         SV* my_perl_eval_sv(SV *sv, I32 croak_on_error)
         {
             dSP;
             SV* retval;

             PUSHMARK(SP);
             perl_eval_sv(sv, G_SCALAR);

             SPAGAIN;
             retval = POPs;
             PUTBACK;

             /* pass $@ as an argument, not as the format string */
             if (croak_on_error && SvTRUE(ERRSV))
                 croak("%s", SvPVx(ERRSV, PL_na));

             return retval;
         }

         /** match(string, pattern)
         **
         ** Used for matches in a scalar context.
         **
         ** Returns 1 if the match was successful; 0 otherwise.
         **/

         I32 match(SV *string, char *pattern)
         {
             SV *command = NEWSV(1099, 0), *retval;

             sv_setpvf(command, "my $string = '%s'; $string =~ %s",
                       SvPV(string,PL_na), pattern);

             retval = my_perl_eval_sv(command, TRUE);
             SvREFCNT_dec(command);

             return SvIV(retval);
         }

         /** substitute(string, pattern)
         **
         ** Used for =~ operations that modify their left-hand side
         ** (s/// and tr///)
         **
         ** Returns the number of successful matches, and
         ** modifies the input string if there were any.
         **/

         I32 substitute(SV **string, char *pattern)
         {
             SV *command = NEWSV(1099, 0), *retval;

             sv_setpvf(command, "$string = '%s'; ($string =~ %s)",
                       SvPV(*string,PL_na), pattern);

             retval = my_perl_eval_sv(command, TRUE);
             SvREFCNT_dec(command);

             *string = perl_get_sv("string", FALSE);
             return SvIV(retval);
         }

         /** matches(string, pattern, matches)
         **
         ** Used for matches in an array context.
         **
         ** Returns the number of matches,
         ** and fills in **matches with the matching substrings
         **/

         I32 matches(SV *string, char *pattern, AV **match_list)
         {
             SV *command = NEWSV(1099, 0);
             I32 num_matches;

             sv_setpvf(command, "my $string = '%s'; @array = ($string =~ %s)",
                       SvPV(string,PL_na), pattern);

             my_perl_eval_sv(command, TRUE);
             SvREFCNT_dec(command);

             *match_list = perl_get_av("array", FALSE);
             num_matches = av_len(*match_list) + 1; /** assume $[ is 0 **/

             return num_matches;
         }

         int main (int argc, char **argv, char **env)
         {
             PerlInterpreter *my_perl = perl_alloc();
             char *embedding[] = { "", "-e", "0" };
             AV *match_list;
             I32 num_matches, i;
             SV *text;

             perl_construct(my_perl);
             perl_parse(my_perl, NULL, 3, embedding, NULL);

             /* create the SV only once the interpreter exists */
             text = NEWSV(1099,0);
             sv_setpv(text,
                 "When he is at a convenience store and the bill comes "
                 "to some amount like 76 cents, Maynard is aware that "
                 "there is something he *should* do, something that will "
                 "enable him to get back a quarter, but he has no idea "
                 "*what*. He fumbles through his red squeezey changepurse "
                 "and gives the boy three extra pennies with his dollar, "
                 "hoping that he might luck into the correct amount. The "
                 "boy gives him back two of his own pennies and then the "
                 "big shiny quarter that is his prize. -RICHH");

             if (match(text, "m/quarter/")) /** Does text contain 'quarter'? **/
                 printf("match: Text contains the word 'quarter'.\n\n");
             else
                 printf("match: Text doesn't contain the word 'quarter'.\n\n");

             if (match(text, "m/eighth/")) /** Does text contain 'eighth'?
                                            **/
                 printf("match: Text contains the word 'eighth'.\n\n");
             else
                 printf("match: Text doesn't contain the word 'eighth'.\n\n");

             /** Match all occurrences of /wi../ **/
             num_matches = matches(text, "m/(wi..)/g", &match_list);
             printf("matches: m/(wi..)/g found %d matches...\n", num_matches);

             for (i = 0; i < num_matches; i++)
                 printf("match: %s\n", SvPV(*av_fetch(match_list, i, FALSE),PL_na));
             printf("\n");

             /** Remove all vowels from text **/
             num_matches = substitute(&text, "s/[aeiou]//gi");
             if (num_matches) {
                 printf("substitute: s/[aeiou]//gi...%d substitutions made.\n",
                        num_matches);
                 printf("Now text is: %s\n\n", SvPV(text,PL_na));
             }

             /** Attempt a substitution **/
             if (!substitute(&text, "s/Perl/C/")) {
                 printf("substitute: s/Perl/C...No substitution made.\n\n");
             }

             SvREFCNT_dec(text);
             PL_perl_destruct_level = 1;
             perl_destruct(my_perl);
             perl_free(my_perl);
         }

     which produces the output (again, long lines have been wrapped
     here)

         match: Text contains the word 'quarter'.

         match: Text doesn't contain the word 'eighth'.

         matches: m/(wi..)/g found 2 matches...
         match: will
         match: with

         substitute: s/[aeiou]//gi...139 substitutions made.
         Now text is: Whn h s t cnvnnc str nd th bll cms t sm mnt lk 76 cnts,
         Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt
         bck qrtr, bt h hs n d *wht*. H fmbls thrgh hs rd sqzy chngprs nd
         gvs th by thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct
         mnt. Th by gvs hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s
         hs prz. -RCHH

         substitute: s/Perl/C...No substitution made.
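
     One caveat about match(), substitute(), and matches() as written
     above: each splices the text between single quotes in a string of
     Perl code, so a text containing an apostrophe or a backslash would
     produce Perl code that fails to compile or means something else.
     Inside a single-quoted Perl string only \\ and \' are treated as
     escapes, so quoting those two characters is enough.  The sketch
     below shows one way to do that in plain C; the name quote_for_perl
     and its interface are hypothetical, not part of the Perl API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, putting a backslash in front
 * of every ' and \ so that the result can sit between single quotes in
 * Perl source.  Returns the number of characters written (not counting
 * the terminating NUL), or -1 if `out` (of size outsz) is too small. */
static long quote_for_perl(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (; *in != '\0'; in++) {
        size_t need = (*in == '\'' || *in == '\\') ? 2 : 1;

        if (o + need + 1 > outsz)   /* +1 leaves room for the NUL */
            return -1;
        if (need == 2)
            out[o++] = '\\';
        out[o++] = *in;
    }

    if (o + 1 > outsz)              /* covers the empty-input case */
        return -1;
    out[o] = '\0';
    return (long)o;
}
```

     In match(), for instance, one might run SvPV(string, PL_na) through
     such a helper and hand the quoted copy to sv_setpvf() in place of
     the raw text.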

  Fiddling with the Perl stack from your C program
     When trying to explain stacks, most computer science textbooks
     mumble something about spring-loaded columns of cafeteria plates:
     the last thing you pushed on the stack is the first thing you pop
     off.  That'll do for our purposes: your C program will push some
     arguments onto "the Perl stack", shut its eyes while some magic
     happens, and then pop the results--the return value of your Perl
     subroutine--off the stack.

     First you'll need to know how to convert between C types and Perl
     types, with newSViv() and sv_setnv() and newAV() and all their
     friends.  They're described in the perlguts manpage.

     Then you'll need to know how to manipulate the Perl stack.  That's
     described in the perlcall manpage.

     Once you've understood those, embedding Perl in C is easy.

     Because C has no builtin function for integer exponentiation, let's
     make Perl's ** operator available to it (this is less useful than
     it sounds, because Perl implements ** with C's pow() function).
     First I'll create a stub exponentiation function in power.pl:

         sub expo {
             my ($a, $b) = @_;
             return $a ** $b;
         }

     Now I'll create a C program, power.c, with a function PerlPower()
     that contains all the perlguts necessary to push the two arguments
     into expo() and to pop the return value out.  Take a deep breath...

         #include <EXTERN.h>
         #include <perl.h>

         static PerlInterpreter *my_perl;

         static void
         PerlPower(int a, int b)
         {
           dSP;                            /* initialize stack pointer      */
           ENTER;                          /* everything created after here */
           SAVETMPS;                       /* ...is a temporary variable.
                                            */
           PUSHMARK(SP);                   /* remember the stack pointer    */
           XPUSHs(sv_2mortal(newSViv(a))); /* push the base onto the stack  */
           XPUSHs(sv_2mortal(newSViv(b))); /* push the exponent onto stack  */
           PUTBACK;                      /* make local stack pointer global */
           perl_call_pv("expo", G_SCALAR); /* call the function             */
           SPAGAIN;                        /* refresh stack pointer         */
                                         /* pop the return value from stack */
           printf ("%d to the %dth power is %d.\n", a, b, (int)POPi);
           PUTBACK;
           FREETMPS;                       /* free that return value        */
           LEAVE;                       /* ...and the XPUSHed "mortal" args.*/
         }

         int main (int argc, char **argv, char **env)
         {
           char *my_argv[] = { "", "power.pl" };

           my_perl = perl_alloc();
           perl_construct( my_perl );

           perl_parse(my_perl, NULL, 2, my_argv, (char **)NULL);
           perl_run(my_perl);

           PerlPower(3, 4);                      /*** Compute 3 ** 4 ***/

           perl_destruct(my_perl);
           perl_free(my_perl);
         }

     Compile and run:

         % cc -o power power.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

         % power
         3 to the 4th power is 81.

  Maintaining a persistent interpreter
     When developing interactive and/or potentially long-running
     applications, it's a good idea to maintain a persistent interpreter
     rather than allocating and constructing a new interpreter multiple
     times.  The major reason is speed: Perl will only be loaded into
     memory once.

     However, you have to be more cautious with namespace and variable
     scoping when using a persistent interpreter.  In previous examples
     we've been using global variables in the default package main.  We
     knew exactly what code would be run, and assumed we could avoid
     variable collisions and outrageous symbol table growth.

     Let's say your application is a server that will occasionally run
     Perl code from some arbitrary file.  Your server has no way of
     knowing what code it's going to run.  Very dangerous.

     If the file is pulled in by perl_parse(), compiled into a newly
     constructed interpreter, and subsequently cleaned out with
     perl_destruct() afterwards, you're shielded from most namespace
     troubles.

     One way to avoid namespace collisions in this scenario is to
     translate the filename into a guaranteed-unique package name, and
     then compile the code into that package using the eval entry in the
     perlfunc manpage.  In the example below, each file will only be
     compiled once.  Or, the application might choose to clean out the
     symbol table associated with the file after it's no longer needed.
     Using the perl_call_argv entry in the perlcall manpage, we'll call
     the subroutine Embed::Persistent::eval_file, which lives in the
     file persistent.pl, and pass the filename and a boolean
     cleanup/cache flag as arguments.

     Note that the process will continue to grow for each file that it
     uses.  In addition, there might be AUTOLOADed subroutines and other
     conditions that cause Perl's symbol table to grow.  You might want
     to add some logic that keeps track of the process size, or restarts
     itself after a certain number of requests, to keep memory
     consumption in check.  You'll also want to scope your variables
     with the my entry in the perlfunc manpage whenever possible.

         package Embed::Persistent;
         #persistent.pl

         use strict;
         use vars '%Cache';
         use Symbol qw(delete_package);

         sub valid_package_name {
             my($string) = @_;
             $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
             # second pass only for words starting with a digit
             $string =~ s|/(\d)|sprintf("/_%2x",unpack("C",$1))|eg;

             # Dress it up as a real package name
             $string =~ s|/|::|g;
             return "Embed" .
                    $string;
         }

         sub eval_file {
             my($filename, $delete) = @_;
             my $package = valid_package_name($filename);
             my $mtime = -M $filename;
             if(defined $Cache{$package}{mtime}
                &&
                $Cache{$package}{mtime} <= $mtime)
             {
                # we have compiled this subroutine already,
                # it has not been updated on disk, nothing left to do
                print STDERR "already compiled $package->handler\n";
             }
             else {
                local *FH;
                open FH, $filename or die "open '$filename' $!";
                local($/) = undef;
                my $sub = <FH>;
                close FH;

                #wrap the code into a subroutine inside our unique package
                my $eval = qq{package $package; sub handler { $sub; }};
                {
                    # hide our variables within this block
                    my($filename,$mtime,$package,$sub);
                    eval $eval;
                }
                die $@ if $@;

                #cache it unless we're cleaning out each time
                $Cache{$package}{mtime} = $mtime unless $delete;
             }

             eval {$package->handler;};
             die $@ if $@;

             delete_package($package) if $delete;

             #take a look if you want
             #print Devel::Symdump->rnew($package)->as_string, $/;
         }

         1;

         __END__

         /* persistent.c */
         #include <EXTERN.h>
         #include <perl.h>
         #include <string.h>

         /* "1" = clean out filename's symbol table after each request,
            "0" = don't.  A string, because it is passed in the char *
            args list below. */
         #ifndef DO_CLEAN
         #define DO_CLEAN "0"
         #endif

         static PerlInterpreter *perl = NULL;

         int
         main(int argc, char **argv, char **env)
         {
             char *embedding[] = { "", "persistent.pl" };
             char *args[] = { "", DO_CLEAN, NULL };
             char filename [1024];
             int exitstatus = 0;

             if((perl = perl_alloc()) == NULL) {
                fprintf(stderr, "no memory!");
                exit(1);
             }
             perl_construct(perl);

             exitstatus = perl_parse(perl, NULL, 2, embedding, NULL);

             if(!exitstatus) {
                exitstatus = perl_run(perl);

                /* fgets() rather than gets(): it cannot overrun filename */
                while(printf("Enter file name: ") &&
                      fgets(filename, sizeof filename, stdin)) {

                    /* fgets() keeps the trailing newline; strip it */
                    filename[strcspn(filename, "\n")] = '\0';

                    /* call the subroutine, passing it the filename as
                       an argument */
                    args[0] = filename;
                    perl_call_argv("Embed::Persistent::eval_file",
                                   G_DISCARD | G_EVAL,
                                   args);

                    /* check $@ */
                    if(SvTRUE(ERRSV))
                        fprintf(stderr, "eval error: %s\n", SvPV(ERRSV,PL_na));
                }
             }

             PL_perl_destruct_level = 0;
             perl_destruct(perl);
             perl_free(perl);
             exit(exitstatus);
         }

     Now compile:

         % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

     Here's an example script file:

         #test.pl
         my $string = "hello";
         foo($string);

         sub foo {
             print "foo says: @_\n";
         }

     Now run:

         % persistent
         Enter file name: test.pl
         foo says: hello
         Enter file name: test.pl
         already compiled Embed::test_2epl->handler
         foo says: hello
         Enter file name: ^C

  Maintaining multiple interpreter instances
     Some rare applications will need to create more than one
     interpreter during a session.  Such an application might
     sporadically decide to release any resources associated with the
     interpreter.

     The program must take care to ensure that this takes place before
     the next interpreter is constructed.  By default, the global
     variable PL_perl_destruct_level is set to 0, since extra cleaning
     isn't needed when a program has only one interpreter.

     Setting PL_perl_destruct_level to 1 makes everything squeaky clean:

         PL_perl_destruct_level = 1;

         while(1) {
             ...
             /* reset global variables here with PL_perl_destruct_level = 1 */
             perl_construct(my_perl);
             ...
             /* clean and reset _everything_ during perl_destruct */
             perl_destruct(my_perl);
             perl_free(my_perl);
             ...
             /* let's go do it again! */
         }

     When perl_destruct() is called, the interpreter's syntax parse tree
     and symbol tables are cleaned up, and global variables are reset.

     Now suppose we have more than one interpreter instance running at
     the same time.  This is feasible, but only if you used the
     -DMULTIPLICITY flag when building Perl.
     By default, that sets PL_perl_destruct_level to 1.

     Let's give it a try:

         #include <EXTERN.h>
         #include <perl.h>

         /* we're going to embed two interpreters */

         #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

         int main(int argc, char **argv, char **env)
         {
             PerlInterpreter
                 *one_perl = perl_alloc(),
                 *two_perl = perl_alloc();
             char *one_args[] = { "one_perl", SAY_HELLO };
             char *two_args[] = { "two_perl", SAY_HELLO };

             perl_construct(one_perl);
             perl_construct(two_perl);

             perl_parse(one_perl, NULL, 3, one_args, (char **)NULL);
             perl_parse(two_perl, NULL, 3, two_args, (char **)NULL);

             perl_run(one_perl);
             perl_run(two_perl);

             perl_destruct(one_perl);
             perl_destruct(two_perl);

             perl_free(one_perl);
             perl_free(two_perl);
         }

     Compile as usual:

         % cc -o multiplicity multiplicity.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

     Run it, Run it:

         % multiplicity
         Hi, I'm one_perl
         Hi, I'm two_perl

  Using Perl modules, which themselves use C libraries, from your C program
     If you've played with the examples above and tried to embed a
     script that use()s a Perl module (such as Socket) which itself uses
     a C or C++ library, this probably happened:

         Can't load module Socket, dynamic loading not available in this perl.
          (You may need to build a new perl executable which either supports
          dynamic loading or has the Socket module statically linked into it.)

     What's wrong?

     Your interpreter doesn't know how to communicate with these
     extensions on its own.  A little glue will help.
     Up until now you've been calling perl_parse(), handing it NULL for
     the second argument:

         perl_parse(my_perl, NULL, argc, my_argv, NULL);

     That's where the glue code can be inserted to create the initial
     contact between Perl and linked C/C++ routines.  Let's take a look
     at some pieces of perlmain.c to see how Perl does this:

         #ifdef __cplusplus
         #  define EXTERN_C extern "C"
         #else
         #  define EXTERN_C extern
         #endif

         static void xs_init _((void));

         EXTERN_C void boot_DynaLoader _((CV* cv));
         EXTERN_C void boot_Socket _((CV* cv));

         EXTERN_C void
         xs_init()
         {
                char *file = __FILE__;
                /* DynaLoader is a special case */
                newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
                newXS("Socket::bootstrap", boot_Socket, file);
         }

     Simply put: for each extension linked with your Perl executable
     (determined during its initial configuration on your computer or
     when adding a new extension), a Perl subroutine is created to
     incorporate the extension's routines.  Normally, that subroutine is
     named Module::bootstrap() and is invoked when you say use Module.
     In turn, this hooks into an XSUB, boot_Module, which creates a Perl
     counterpart for each of the extension's XSUBs.  Don't worry about
     this part; leave that to the xsubpp and extension authors.  If your
     extension is dynamically loaded, DynaLoader creates
     Module::bootstrap() for you on the fly.  In fact, if you have a
     working DynaLoader then there is rarely any need to link in any
     other extensions statically.

     Once you have this code, slap it into the second argument of
     perl_parse():

         perl_parse(my_perl, xs_init, argc, my_argv, NULL);

     Then compile:

         % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

         % interp
           use Socket;
           use SomeDynamicallyLoadedModule;

           print "Now I can use extensions!\n"

     ExtUtils::Embed can also automate writing the xs_init glue code.

         % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
         % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
         % cc -c interp.c  `perl -MExtUtils::Embed -e ccopts`
         % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`

     Consult the perlxs manpage and the perlguts manpage for more
     details.

  Embedding Perl under Win32
     At the time of this writing (5.004), there are two versions of Perl
     which run under Win32.  (The two versions are merging in 5.005.)
     Interfacing to ActiveState's Perl library is quite different from
     the examples in this documentation, as significant changes were
     made to the internal Perl API.  However, it is possible to embed
     ActiveState's Perl runtime.  For details, see the Perl for Win32
     FAQ at http://www.perl.com/perl/faq/win32/Perl_for_Win32_FAQ.html.

     With the "official" Perl version 5.004 or higher, all the examples
     within this documentation will compile and run untouched, although
     the build process is slightly different between Unix and Win32.

     For starters, backticks don't work under the Win32 native command
     shell.  The ExtUtils::Embed kit on CPAN ships with a script called
     genmake, which generates a simple makefile to build a program from
     a single C source file.
It can be used like this:

    C:\ExtUtils-Embed\eg> perl genmake interp.c
    C:\ExtUtils-Embed\eg> nmake
    C:\ExtUtils-Embed\eg> interp -e "print qq{I'm embedded in Win32!\n}"

You may wish to use a more robust environment such as the Microsoft Developer Studio. In this case, run this to generate perlxsi.c:

    perl -MExtUtils::Embed -e xsinit

Create a new project and Insert -> Files into Project: perlxsi.c, perl.lib, and your own source files, e.g. interp.c. Typically you'll find perl.lib in C:\perl\lib\CORE; if not, you should see the CORE directory relative to perl -V:archlib. The studio will also need this path so it knows where to find Perl include files. This path can be added via the Tools -> Options -> Directories menu. Finally, select Build -> Build interp.exe and you're ready to go.

MORAL

You can sometimes write faster code in C, but you can always write code faster in Perl. Because you can use each from the other, combine them as you wish.

AUTHOR

Jon Orwant <orwant@tpj.com> and Doug MacEachern <dougm@osf.org>, with small contributions from Tim Bunce, Tom Christiansen, Guy Decoux, Hallvard Furuseth, Dov Grobgeld, and Ilya Zakharevich.

Doug MacEachern has an article on embedding in Volume 1, Issue 4 of The Perl Journal (http://tpj.com). Doug is also the developer of the most widely-used Perl embedding: the mod_perl system (perl.apache.org), which embeds Perl in the Apache web server. Oracle, Binary Evolution, ActiveState, and Ben Sugars's nsapi_perl have used this model for Oracle, Netscape and Internet Information Server Perl plugins.

July 22, 1998

COPYRIGHT

Copyright (C) 1995, 1996, 1997, 1998 Doug MacEachern and Jon Orwant. All Rights Reserved.
Permission is granted to make and distribute verbatim copies of this documentation provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this documentation under the conditions for verbatim copying, provided also that they are marked clearly as modified versions, that the authors' names and title are unchanged (though subtitles and additional authors' names may be added), and that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this documentation into another language, under the above conditions for modified versions.